MGE Big Data Technology
【Chief Members】
Long Keping, Professor at the University of Science and Technology Beijing; recipient of the National Science Fund for Distinguished Young Scholars; Yangtze River Scholar
Zhang Xiaotong, Professor at the University of Science and Technology Beijing
Zhang Haijun, Professor at the University of Science and Technology Beijing; recipient of the National Outstanding Youth Science Fund
Zhang Dawei, Professor at the University of Science and Technology Beijing
Huang Fuwei, Associate Professor at the University of Science and Technology Beijing
【Research Background】
China’s “Marine Development” and “One Belt, One Road” initiatives are placing greater demands on materials science. However, China currently depends heavily on imports for key core materials, and its material resource databases are few in number and lack diversity. Moreover, a long interval separates R&D on new materials from their application. With the development of cloud computing and big data, it has become possible to establish a database platform dedicated to Materials Genome Engineering (MGE) by designing a material big data sharing network based on caching and edge computing. The aim is not only to develop new materials and technologies efficiently and to accelerate their industrial application, but also to make full use of the large volume of available material data. Combined with big data technology, these steps are intended to improve the properties of existing materials and to speed up the discovery of new ones. In this way, the capacity to produce new materials that are “Made in China” should be built up gradually and strengthened continuously.
【Research Objectives】
Real-time data on failure behaviors such as corrosion, fatigue, fracture, and aging during material service need to be acquired, transmitted, and applied intelligently. To this end, the team is committed to researching and developing secure transmission technology for dense, multi-source material big data, and to solving the problems of storing and sharing heterogeneous material data. This includes realizing long-term sensing, monitoring, and data transmission over a large geographical range and overcoming the bottleneck of digital representation and standardized description of multi-dimensional material data. The team also aims to establish intelligent collection technology for dense, multi-source data, so as to supply material corrosion big data to IoT (Internet of Things)-based applications, to cloud computing-based data transmission technology, and to machine learning-based big data mining technology. The goal is to form a platform for material corrosion data collection and a dynamic container-based database system dedicated to MGE. This system is expected to manage and use material big data efficiently, to accelerate the development of high-quality new materials, and to accelerate their industrial application while ensuring their safe use.
【Main Research Areas】
1. Intelligent acquisition technology for multi-source dense corrosion data
2. Highly reliable IoT data transmission technology based on cloud computing
3. Machine learning-based data mining technology for corrosion big data of materials
4. Dynamic containers with user-defined data storage structure
5. Dynamic container-based database system dedicated to MGE
【Significant Research Progress】
1. IoT-based material corrosion data collection platform
The team designed a cloud computing-based IoT architecture that supports intelligent collection and edge caching of dense, multi-source data. They solved the problem of collecting, storing, and transmitting material big data over large geographical ranges and long periods, and designed an IoT data transmission and service mechanism with low latency and high reliability for material big data. To ensure real-time monitoring of material failure behaviors such as corrosion, fatigue, fracture, and aging during material service, the team proposed a cloud computing-based service mechanism for edge caching and effective transmission of dense, multi-source data. The team aims to make full use of 5G ultra-dense networking technology to enhance the openness of the IoT, expand its coverage, and strengthen its capability to collect material big data. Meanwhile, the edge capabilities of cloud computing will be used to improve the performance of network terminals, so that edge storage space is used as fully as possible for analyzing and mining dense, multi-source data.
Figure 1 IoT-based big data network on material corrosion
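The edge caching idea described above can be illustrated with a minimal sketch. It is not the team's implementation: the class name EdgeCache, the Reading fields, the batch size, and the upload callback are all assumptions made for illustration. The sketch shows one common pattern, in which an edge node buffers sensor readings and only discards them once the uplink confirms delivery.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List


@dataclass(frozen=True)
class Reading:
    """One sensor sample; the field names are illustrative assumptions."""
    sensor_id: str
    timestamp: float   # seconds since the epoch
    quantity: str      # e.g. "corrosion_current"
    value: float


class EdgeCache:
    """Bounded buffer on an edge node that uploads readings in batches.

    Readings stay cached until the upload callback reports success, so a
    failed upload loses nothing unless the buffer itself overflows.
    """

    def __init__(self, upload: Callable[[List[Reading]], bool],
                 batch_size: int = 3, capacity: int = 100) -> None:
        self._upload = upload
        self._batch_size = batch_size
        # When full, deque(maxlen=...) silently drops the oldest reading.
        self._buffer: Deque[Reading] = deque(maxlen=capacity)

    def add(self, reading: Reading) -> None:
        """Cache a reading and attempt an upload once a batch is available."""
        self._buffer.append(reading)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self) -> bool:
        """Try to upload everything cached; keep the data on failure."""
        if not self._buffer:
            return True
        batch = list(self._buffer)
        if self._upload(batch):
            self._buffer.clear()
            return True
        return False

    def pending(self) -> int:
        """Number of readings still waiting on the edge node."""
        return len(self._buffer)
```

In this sketch the upload callback stands in for whatever transport links the edge node to the cloud. If the link is down, readings accumulate in order and are sent together on the next successful attempt.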
2. MGE-dedicated database platform
To overcome obstacles to storing and sharing multi-source heterogeneous material data, the team researched and developed dynamic containers with user-defined data storage structures (Figure 2). They also solved the problem of digital representation and standardized description of multi-source heterogeneous material data. They then developed a dynamic container-based, MGE-dedicated database system that integrates the collection, storage, query, and display of data in a single platform. To support intelligent search and push on the platform, the team developed a method that extracts an entity relationship diagram (ERD) of the stored data (Figure 3); the method is easy to implement and operate. To identify the intellectual property of platform data, Digital Object Identifiers (DOIs) were adopted for the platform. A further technology automatically generates multiple DOIs for data objects on the scientific data service platform (Figure 4). It can be used to apply for multiple types of DOIs for material data in material databases and is highly scalable: in theory, it can register scientific data with any kind of DOI in any quantity. The system is easy to implement and operate, and it provides a review mechanism for the registration of multiple DOIs.
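As a rough illustration of a container with a user-defined storage structure, the sketch below lets a user declare fields at run time, validates records against that declaration, and derives relationship edges between containers of the kind an ERD would show. It is only a sketch under stated assumptions: FieldSpec, Container, the references attribute, and extract_relationships are hypothetical names, and the platform's actual design is not described in enough detail here to reproduce.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple, Type


@dataclass(frozen=True)
class FieldSpec:
    """User-declared field; `references` names another container, if any."""
    name: str
    type_: Type
    unit: str = ""
    required: bool = True
    references: str = ""


@dataclass
class Container:
    """A record store whose structure is defined by the user at run time."""
    name: str
    fields: Tuple[FieldSpec, ...]
    records: List[Dict[str, Any]] = field(default_factory=list)

    def insert(self, record: Dict[str, Any]) -> None:
        """Validate a record against the declared fields, then store it."""
        known = {spec.name for spec in self.fields}
        unknown = set(record) - known
        if unknown:
            raise ValueError(f"unknown fields: {sorted(unknown)}")
        for spec in self.fields:
            if spec.name not in record:
                if spec.required:
                    raise ValueError(f"missing required field: {spec.name}")
                continue
            # Strict type check: an int is not accepted for a float field.
            if not isinstance(record[spec.name], spec.type_):
                raise TypeError(f"{spec.name} must be {spec.type_.__name__}")
        self.records.append(dict(record))

    def query(self, **equals: Any) -> List[Dict[str, Any]]:
        """Return records whose fields equal all the given values."""
        return [r for r in self.records
                if all(r.get(k) == v for k, v in equals.items())]


def extract_relationships(
        containers: List[Container]) -> List[Tuple[str, str, str]]:
    """Return (source container, field, target container) edges."""
    names = {c.name for c in containers}
    return [(c.name, spec.name, spec.references)
            for c in containers
            for spec in c.fields
            if spec.references and spec.references in names]
```

For example, a "corrosion_test" container whose alloy_id field references an "alloy" container would yield the single edge ("corrosion_test", "alloy_id", "alloy"), which a search or recommendation layer could then follow.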
Figure 2 Dynamic container technology
Figure 3 Method able to extract the ERD of data storage and suitable for intelligent search and push
Figure 4 Technology to automatically generate multiple DOIs for data objects under the scientific data service platform
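The automatic generation of multiple DOIs with a review step, described in section 2, might look roughly like the sketch below. Everything specific in it is an assumption: the prefix "10.12345" is a made-up placeholder, the "kinds" of DOI and the hash-based suffix scheme are invented for illustration, and nothing is registered with a real agency. The only fixed point is the general DOI form, a prefix beginning with "10." followed by a slash and a suffix.

```python
import hashlib
from dataclasses import dataclass
from typing import List


@dataclass
class DoiRequest:
    """A candidate DOI awaiting review before any real registration."""
    doi: str
    object_id: str
    kind: str
    status: str = "pending"   # becomes "approved" or "rejected"


def mint_dois(prefix: str, object_id: str,
              kinds: List[str]) -> List[DoiRequest]:
    """Build one pending DOI request per kind for a single data object."""
    if not prefix.startswith("10.") or "/" in prefix:
        raise ValueError("a DOI prefix has the form '10.<registrant code>'")
    requests = []
    for kind in kinds:
        # Hypothetical suffix scheme: a short, deterministic hash of the
        # kind and object ID, so repeated runs yield the same DOI string.
        digest = hashlib.sha256(
            f"{kind}:{object_id}".encode("utf-8")).hexdigest()[:10]
        requests.append(
            DoiRequest(f"{prefix}/{kind}.{digest}", object_id, kind))
    return requests


def review(request: DoiRequest, approve: bool) -> DoiRequest:
    """Record a reviewer's decision; each request is reviewed only once."""
    if request.status != "pending":
        raise ValueError(f"{request.doi} was already reviewed")
    request.status = "approved" if approve else "rejected"
    return request
```

Calling mint_dois("10.12345", "sample-001", ["dataset", "report"]) would produce two distinct pending requests for the same data object, each of which must pass review before a real system would submit it for registration.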